Three discretization methods for rule induction

Authors

  • Jerzy W. Grzymala-Busse
  • Jerzy Stefanowski
Abstract

We discuss problems associated with inducing decision rules from data with numerical attributes, which real-life data frequently contain. Rule induction from numerical data requires an additional step, called discretization, in which numerical values are converted into intervals. Most existing discretization methods are applied before rule induction, as part of data preprocessing; some discretize numerical attributes while learning decision rules. We compare the classification accuracy of a discretization method based on conditional entropy, applied before rule induction, with that of two newly proposed methods incorporated directly into the rule induction algorithm LEM2, in which discretization and rule induction are performed at the same time. In all three approaches the same system is used to classify new, unseen data. We conclude that the error rates of the three methods do not differ significantly; however, the rules induced by the two new methods are simpler and stronger. © 2001 John Wiley & Sons, Inc.
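The abstract does not spell out how the entropy-based method picks its intervals, so the following is only a generic illustration of the idea, not the authors' procedure. It is a minimal sketch that assumes a single binary cut per attribute and chooses the cut point minimizing the conditional entropy of the class given the resulting two intervals. The function names `entropy` and `best_cut_point` and the toy data are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def best_cut_point(values, labels):
    """Return (cut, conditional_entropy) for the best single binary cut.

    Candidate cuts are midpoints between consecutive distinct values.
    Returns (None, H(class)) if no cut lowers the entropy.
    Illustrative assumption: one cut per attribute, no stopping rule.
    """
    pairs = list(zip(values, labels))
    distinct = sorted(set(values))
    n = len(pairs)
    best_cut, best_h = None, entropy(labels)
    for lo, hi in zip(distinct, distinct[1:]):
        cut = (lo + hi) / 2
        left = [y for x, y in pairs if x < cut]
        right = [y for x, y in pairs if x >= cut]
        # Conditional entropy of the class given the two intervals.
        h = len(left) / n * entropy(left) + len(right) / n * entropy(right)
        if h < best_h:
            best_cut, best_h = cut, h
    return best_cut, best_h


# Toy data: the classes separate cleanly between 3 and 7.
cut, h = best_cut_point([1, 2, 3, 7, 8], ["a", "a", "a", "b", "b"])
```

On this toy data the selected cut is 5.0, which yields the intervals below 5.0 and from 5.0 upward, each containing a single class. A practical discretizer would apply such a criterion repeatedly and add a stopping rule; both are omitted here.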

Similar resources

Experimental Evaluation of Discretization Schemes for Rule Induction

This paper proposes an experimental evaluation of various discretization schemes in three different evolutionary systems for inductive concept learning. The various discretization methods are used in order to obtain a number of discretization intervals, which represent the basis for the methods adopted by the systems for dealing with numerical values. Basically, for each rule and attribute, one...

A Comparison of Three Strategies to Rule Induction from Data with Numerical Attributes

Our main objective was to compare two discretization techniques, both based on cluster analysis, with a new rule induction algorithm called MLEM2, in which discretization is performed simultaneously with rule induction. The MLEM2 algorithm is an extension of the existing LEM2 rule induction algorithm. The LEM2 algorithm works correctly only for symbolic attributes and is a part of the LERS data...

A Tuning Aid for Discretization in Rule Induction

This paper examines where a tuning aid can be useful to help discretization of numerical attributes in rule induction, and subsequently improve deduction of induction results. Different discretization methods use different strategies to set up the borders for continuous attributes. They mostly incorporate class supervision to define the discretization borders. The tuning aid we present uses an unsu...

Reduct Calculation and Discretization of Numeric Attributes in Sparse Decision Systems

In this paper we discuss three problems in Data Mining Sparse Decision Systems: the problem of short reduct calculation, discretization of numerical attributes and rule induction. We present algorithms that provide approximate solutions to these problems and analyze the complexity of these algorithms.

Three Strategies to Rule Induction from Data with Numerical Attributes

Rule induction from data with numerical attributes must be accompanied by discretization. Our main objective was to compare two discretization techniques, both based on cluster analysis, with a new rule induction algorithm called MLEM2, in which discretization is performed simultaneously with rule induction. The MLEM2 algorithm is an extension of the existing LEM2 rule induction algorithm, work...

Journal:
  • Int. J. Intell. Syst.

Volume: 16

Publication date: 2001